Cross Language Query Post-Translation Expansion Based on Matrix-Weighted Association Rules
HUANG Mingxuan1,2, JIANG Caoqing1,2, HE Donglei1,2
1. Guangxi Key Laboratory Cultivation Base of Cross-Border E-commerce Intelligent Information Processing, Guangxi University of Finance and Economics, Nanning 530003
2. School of Information and Statistics, Guangxi University of Finance and Economics, Nanning 530003
First, a method for computing matrix-weighted itemset support is proposed, and an algorithm for mining matrix-weighted association patterns for cross-language query expansion is presented. Then, an algorithm for cross-language query post-translation expansion based on matrix-weighted association rule mining is put forward. An initial cross-language retrieval is performed via machine translation to obtain the top initially retrieved documents (TIRDs), and the relevance feedback documents (RFDs) are selected from the TIRDs through user relevance judgment. Matrix-weighted frequent itemsets containing original query terms are mined from the RFDs by computing their support, and association rules involving original query terms are extracted from these frequent itemsets under a confidence-interest evaluation framework. To implement cross-language query post-translation expansion, the consequents or antecedents of the rules are taken as expansion terms, and the importance of each expansion term is measured by the confidence and interest of its rule. Experiments on the NTCIR-5 CLIR standard test set show that the proposed algorithm improves the performance of cross-language query expansion and is beneficial for cross-language retrieval with long queries. Post-translation expansion with rule consequents performs better than expansion with rule antecedents.
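The mining and expansion steps described above can be illustrated with a minimal Python sketch of consequent expansion. It is not the authors' algorithm: the abstract does not give the support formula, so `mw_support` uses an assumed definition (summed term weights over the documents that contain the whole itemset, normalised by the total weight and the itemset size), `interest` is assumed to be a lift-style ratio, and candidate itemsets are enumerated by brute force rather than by a pruned frequent-itemset search. All function names, parameters, and thresholds are hypothetical.

```python
from itertools import combinations


def mw_support(itemset, docs, total_weight):
    """Assumed matrix-weighted support (illustrative, not the paper's formula):
    sum the weights of the itemset's terms over the documents that contain
    every term, then normalise by total weight and itemset size."""
    covered = [d for d in docs if all(t in d for t in itemset)]
    weight = sum(d[t] for d in covered for t in itemset)
    return weight / (total_weight * len(itemset))


def expansion_terms(docs, query_terms, min_supp, min_conf, min_int, max_size=3):
    """Return {expansion term: best rule confidence} for rules of the form
    (query terms) -> (non-query terms), i.e. consequent expansion.

    docs: relevance feedback documents, each a dict mapping term -> weight.
    """
    total_weight = sum(sum(d.values()) for d in docs)
    vocab = sorted({t for d in docs for t in d})
    query = set(query_terms)
    scores = {}
    # Brute-force enumeration of candidate itemsets (assumption: small vocabulary).
    for size in range(2, max_size + 1):
        for candidate in combinations(vocab, size):
            items = set(candidate)
            antecedent = items & query   # original query terms in the itemset
            consequent = items - query   # candidate expansion terms
            if not antecedent or not consequent:
                continue
            supp = mw_support(items, docs, total_weight)
            if supp <= 0 or supp < min_supp:
                continue
            supp_a = mw_support(antecedent, docs, total_weight)
            supp_c = mw_support(consequent, docs, total_weight)
            conf = supp / supp_a
            interest = supp / (supp_a * supp_c)  # assumed lift-style interest
            if conf >= min_conf and interest >= min_int:
                for term in consequent:
                    scores[term] = max(scores.get(term, 0.0), conf)
    return scores


if __name__ == "__main__":
    # Toy feedback documents with made-up term weights.
    feedback_docs = [
        {"query": 1.0, "expansion": 1.0, "retrieval": 0.5},
        {"query": 1.0, "expansion": 0.5},
        {"query": 0.5, "noise": 1.0},
    ]
    print(expansion_terms(feedback_docs, ["query"], 0.2, 0.5, 1.0))
```

Antecedent expansion would follow the same pattern with the roles swapped, keeping rules whose consequent consists of query terms and taking the antecedent terms as expansion terms.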